-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/staking: lazily get consensus key address #9264
Conversation
cc @odeke-em |
Codecov Report
@@ Coverage Diff @@
## master #9264 +/- ##
==========================================
- Coverage 60.16% 60.16% -0.01%
==========================================
Files 594 594
Lines 37106 37109 +3
==========================================
+ Hits 22325 22326 +1
- Misses 12805 12806 +1
- Partials 1976 1977 +1
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thank you @cuonglm, LGTM!
if err != nil { | ||
return err | ||
} | ||
|
||
strKey := string(consPk.Bytes()) | ||
|
||
if _, ok := addrMap[strKey]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can even further avoid the []byte->string allocation by using
_, ok := addrMap[string(consPk.Bytes())]
if ok {
...
}
...
addrMap[string(consPk.Bytes())] = ...
which will mos definitely show an improvement in the hot path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the []byte
-> string
trick is not applicable, because consPk.Bytes()
is a method call, the compiler need to allocate temporary variable for that, which is roughly like:
tmp := consPk.Bytes()
strKey := string(tmp)
Here's the benchmark result (comparing with this PR):
name old time/op new time/op delta
ValidateGenesis10Validators-8 1.13µs ± 1% 1.09µs ± 0% -3.46% (p=0.008 n=5+5)
ValidateGenesis100Validators-8 10.1µs ± 0% 10.3µs ± 1% +2.90% (p=0.008 n=5+5)
ValidateGenesis400Validators-8 38.3µs ± 0% 40.5µs ± 0% +5.76% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
ValidateGenesis10Validators-8 667B ± 0% 667B ± 0% ~ (all equal)
ValidateGenesis100Validators-8 6.22kB ± 0% 6.22kB ± 0% ~ (p=0.159 n=5+5)
ValidateGenesis400Validators-8 24.4kB ± 0% 24.4kB ± 0% ~ (p=0.810 n=5+5)
name old allocs/op new allocs/op delta
ValidateGenesis10Validators-8 13.0 ± 0% 13.0 ± 0% ~ (all equal)
ValidateGenesis100Validators-8 105 ± 0% 105 ± 0% ~ (all equal)
ValidateGenesis400Validators-8 408 ± 0% 408 ± 0% ~ (all equal)
it actually run slower because we call the method 2 times, which will introduce 2 temp vars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point about it being a method call! However at least with a single variable then directly doing the map []->string call should work unless the compiler is getting faulty but I don’t see how doing m[string(bKey)] will make things slower it should eliminate the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@odeke-em Hmm, not sure I got what you mean. What make thing slower because we call consPk.Bytes()
2 times, which will introduce 2 temp vars.
With those temp vars, the m[string([]byte)]
optimization is applied already, that's why you can see your suggestion and the PR have the same alloc in benchmark result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you see my second suggestion? Essentially adapt accordingly so:
bKey := consPk.Bytes()
m[string(bKey)]
...
m[string(bKey)]
No way that's not going to not eliminate the 2 []byte->string conversions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With those temp vars, the m[string([]byte)] optimization is applied already, that's why you can see your suggestion and the PR have the same alloc in benchmark result.
I don't believe so fully, you are doing
strKey := string(...)
Then using strKey around, we don't have to incur that conversion at all given that strKey is being used as the key instead of m[string(bKey)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@odeke-em What you do:
bKey := consPk.Bytes()
then do string(bKey)
in lookup, is already the same thing with what string(consPk.Bytes())
does. Because the compiler rewrite it to:
tmp := consPk.Bytes()
strKey := string(tmp)
the compiler can realize that tmp
isn't modified elsewhere, so the string conversion make no allocation. Here's the benchmark result with your second suggestion:
name old time/op new time/op delta
ValidateGenesis10Validators-8 1.14µs ± 0% 1.09µs ± 0% -4.12% (p=0.008 n=5+5)
ValidateGenesis100Validators-8 10.1µs ± 0% 11.3µs ±15% +12.02% (p=0.008 n=5+5)
ValidateGenesis400Validators-8 38.4µs ± 0% 40.5µs ± 1% +5.42% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
ValidateGenesis10Validators-8 667B ± 0% 667B ± 0% ~ (all equal)
ValidateGenesis100Validators-8 6.22kB ± 0% 6.22kB ± 0% ~ (p=1.000 n=5+5)
ValidateGenesis400Validators-8 24.4kB ± 0% 24.4kB ± 0% ~ (p=0.206 n=5+5)
name old allocs/op new allocs/op delta
ValidateGenesis10Validators-8 13.0 ± 0% 13.0 ± 0% ~ (all equal)
ValidateGenesis100Validators-8 105 ± 0% 105 ± 0% ~ (all equal)
ValidateGenesis400Validators-8 408 ± 0% 408 ± 0% ~ (all equal)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tell me then, what does the next m[strKey] down below amount to instead of m[string(bKey)] much down below where there is not an edit? Not a hill am willing to die on and later on can be revised but would be nice to examine all the code paths.
The consensus key address is only used for error reporting, but benchmarking shown that it takes a big number of running time. By getting the consensus lazily, the validation process both run faster and use less memory. ValidateGenesis10Validators-8 2.06µs ± 0% 1.13µs ± 1% -45.15% (p=0.008 n=5+5) ValidateGenesis100Validators-8 19.1µs ± 0% 10.1µs ± 0% -47.47% (p=0.008 n=5+5) ValidateGenesis400Validators-8 75.7µs ± 2% 38.3µs ± 0% -49.43% (p=0.008 n=5+5) name old alloc/op new alloc/op delta ValidateGenesis10Validators-8 987B ± 0% 667B ± 0% -32.42% (p=0.008 n=5+5) ValidateGenesis100Validators-8 9.42kB ± 0% 6.22kB ± 0% -33.97% (p=0.008 n=5+5) ValidateGenesis400Validators-8 37.2kB ± 0% 24.4kB ± 0% -34.37% (p=0.008 n=5+5) name old allocs/op new allocs/op delta ValidateGenesis10Validators-8 23.0 ± 0% 13.0 ± 0% -43.48% (p=0.008 n=5+5) ValidateGenesis100Validators-8 205 ± 0% 105 ± 0% -48.78% (p=0.008 n=5+5) ValidateGenesis400Validators-8 808 ± 0% 408 ± 0% -49.50% (p=0.008 n=5+5) Fixes #9263
Description
The consensus key address is only used for error reporting, but
benchmarking shown that it takes a big number of running time.
By getting the consensus lazily, the validation process both run faster
and use less memory.
Fixes #9263
Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.
docs/
) or specification (x/<module>/spec/
)godoc
comments.Unreleased
section inCHANGELOG.md
Files changed
in the Github PR explorerCodecov Report
in the comment section below once CI passes